Search and Navigation in Semantically Integrated Document Collections
Abstract
This paper presents a novel approach to semantic search and navigation in collections of office-like documents. The approach builds on a semantic document model that we developed to enable unique identification, semantic annotation, and semantic linking of the document units of office-like documents. To annotate document units and to link semantically related units, we first conceptualize each unit’s semantics and represent it as a vector of ontological concepts with a corresponding weight vector. For semantic search, we represent a user query by a concept vector generated in the same way as the document units’ concept vectors, and determine the search results by measuring the similarity between the query’s and the document units’ concept weight vectors. After the search, the user can follow the semantic links of a selected document unit to navigate the collection and discover semantically related document units. A preliminary evaluation, conducted with a prototype implementation, gave promising results, which we briefly analyze.
Keywords: semantic search, semantic linking and navigation
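The search step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so cosine similarity is assumed here. The concept names, weights, document-unit identifiers, and the function names `cosine_similarity` and `rank_document_units` are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse concept-weight vectors.

    Each vector is a dict mapping an ontological concept to its weight.
    Cosine is an assumed measure; the abstract only says "similarity".
    """
    dot = sum(w * b.get(concept, 0.0) for concept, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_document_units(query_vec, unit_vecs):
    """Return (unit_id, score) pairs with a non-zero score, best match first."""
    scored = [(uid, cosine_similarity(query_vec, vec))
              for uid, vec in unit_vecs.items()]
    scored = [(uid, score) for uid, score in scored if score > 0.0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical document units and query, for illustration only.
units = {
    "doc1#slide3": {"Ontology": 0.8, "SemanticSearch": 0.6},
    "doc2#para7": {"Ontology": 0.2, "Spreadsheet": 0.9},
    "doc3#sec1": {"Budget": 1.0},
}
query = {"Ontology": 0.5, "SemanticSearch": 0.5}

results = rank_document_units(query, units)
for unit_id, score in results:
    print(f"{unit_id}: {score:.3f}")
```

Storing each vector as a sparse dictionary keeps the comparison proportional to the number of concepts actually attached to a unit, which suits an ontology with many concepts of which each document unit uses only a few.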
Similar resources
Metadata for Multidimensional Categorization and Navigation Support on Multimedia Documents
Increasing technological effort is being spent on integrated representations of document collections and metadata. For instance, the emerging XML standard offers opportunities to represent metadata for, e.g., improving query and navigation support within web-based document collections. Despite this development, most applications of catalogue metaphors on the web ranging from small web site catal...
A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives
Mass digitization of document collections with further processing and semantic annotation is an increasing activity among libraries and archives at large for preservation, browsing and navigation, and search purposes. In this paper we propose a software architecture for the process of converting high volumes of document collections to semantically annotated digital libraries. The proposed archi...
Querying Structured XML Document Collections
The number of XML document collections is increasing, and it is important to query them effectively. Document semantics resides in both the text and the structure. In this paper we describe a query interface towards XML document collections. The interface is automatically tailored to the document structure, as described by its XML Schema. External schema annotation in RDF contains information used to...
Querying XML Document Collections
In this paper we describe a query interface towards XML document collections. External schema annotation in RDF contains information used to dynamically build the interface tailored to the user’s characteristics and to the document structure, as described by its XML Schema. The interface makes the user aware of structure semantics, so supporting her/him in formulating semantically correct queri...
Information Architecture of Research Institutes’ Website, Case Study: Iranian Research Institute for Information Science and Technology’s Website
Purpose: As mission-oriented organizations, research institutes have the task of answering community questions in specialized areas, and should therefore be able to effectively present their outputs to their target users. Achieving such a goal requires the proper use of information architecture principles to properly organize the information platform in which the research institutes interact wi...